Curriculum Learning for Handwritten Text Line Recognition
Recurrent Neural Networks (RNNs) have recently achieved the best performance
in off-line handwritten text recognition. At the same time, training RNNs by
gradient descent leads to slow convergence, and training times are particularly
long when the training database consists of full lines of text. In this paper,
we propose an easy way to accelerate stochastic gradient descent in this
set-up, and in the general context of learning to recognize sequences. The
principle, called curriculum learning, or shaping, is to first learn to
recognize short sequences before training on all available training
sequences. Experiments on three different handwritten text databases (Rimes,
IAM, OpenHaRT) show that a simple implementation of this strategy can
significantly speed up the training of RNNs for text recognition, and even
significantly improve performance in some cases.
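The curriculum strategy described above can be sketched in a few lines of Python (a minimal illustration; the function and parameter names are assumptions, not the paper's actual implementation):

```python
# Curriculum learning for sequence recognition: start training on the
# shortest target sequences, then gradually admit longer ones.
# All names here are illustrative; the paper does not specify an API.

def curriculum_batches(samples, num_stages=4, batch_size=32):
    """Yield training batches stage by stage, each stage adding longer sequences.

    `samples` is a list of (input, target) pairs, where targets are sequences.
    """
    ordered = sorted(samples, key=lambda s: len(s[1]))  # shortest targets first
    n = len(ordered)
    for stage in range(1, num_stages + 1):
        # Stage k trains on the shortest k/num_stages fraction of the data.
        cutoff = max(batch_size, n * stage // num_stages)
        pool = ordered[:cutoff]
        for i in range(0, len(pool), batch_size):
            yield pool[i:i + batch_size]
```

Early stages then see only short, easy sequences, which in the paper's setting speeds up the convergence of stochastic gradient descent.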
Multiple Document Datasets Pre-training Improves Text Line Detection With Deep Neural Networks
In this paper, we introduce a fully convolutional network for the document
layout analysis task. While state-of-the-art methods use models pre-trained
on natural scene images, our method, Doc-UFCN, relies on a U-shaped model
trained from scratch for detecting objects in historical documents. We
consider the line segmentation task, and more generally the layout analysis
problem, as a pixel-wise classification task: our model outputs a
pixel labeling of the input images. We show that Doc-UFCN outperforms
state-of-the-art methods on various datasets and also demonstrate that parts
pre-trained on natural scene images are not required to reach good
results. In addition, we show that pre-training on multiple document datasets
can improve performance. We evaluate the models using various metrics to
allow a fair and complete comparison between the methods.
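The pixel-wise formulation above can be illustrated with a toy decoding step (a sketch in plain Python; the shapes and the two-class set are assumptions, not Doc-UFCN's actual configuration):

```python
# A pixel-wise classifier scores every class at every pixel; the layout
# prediction is the per-pixel argmax over those scores. The two classes
# below (0 = background, 1 = text line) are illustrative only.

def logits_to_label_map(logits):
    """logits: nested lists of shape (num_classes, H, W) -> (H, W) label map."""
    num_classes = len(logits)
    height, width = len(logits[0]), len(logits[0][0])
    return [
        [max(range(num_classes), key=lambda c: logits[c][y][x])
         for x in range(width)]
        for y in range(height)
    ]

# Toy scores for a 2x3 image.
logits = [
    [[0.9, 0.2, 0.8],
     [0.1, 0.7, 0.6]],   # class 0: background scores
    [[0.1, 0.8, 0.2],
     [0.9, 0.3, 0.4]],   # class 1: text-line scores
]
label_map = logits_to_label_map(logits)
```

Each output pixel carries a layout class, from which line polygons can then be extracted by connected-component analysis.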
Key-value information extraction from full handwritten pages
We propose a Transformer-based approach for information extraction from
digitized handwritten documents. Our approach combines, in a single model, the
different steps that were so far performed by separate models: feature
extraction, handwriting recognition, and named entity recognition. We compare
this integrated approach with traditional two-stage methods that perform
handwriting recognition before named entity recognition, and present results at
different levels: line, paragraph, and page. Our experiments show that
attention-based models are especially interesting when applied to full pages,
as they do not require any prior segmentation step. Finally, we show that they
are able to learn from key-value annotations: a list of important words with
their corresponding named entities. We compare our models to state-of-the-art
methods on three public databases (IAM, ESPOSALLES, and POPP) and outperform
previous results on all three datasets.
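The key-value annotations mentioned above pair important words with entity labels; one common way to feed such pairs to an attention-based sequence model is to serialize them into a single tagged target string (an assumption for illustration, not necessarily the paper's exact format):

```python
# Serialize key-value annotations (word, entity) into one target sequence
# with entity tags. The tag syntax and example values are hypothetical.

def serialize_key_values(annotations):
    """annotations: list of (word, entity) pairs -> single target string."""
    return " ".join(f"<{entity}> {word}" for word, entity in annotations)

target = serialize_key_values([("Maria", "name"), ("1883", "birth_year")])
```

A model trained on such targets never needs word-level localization, only the page image and the tagged string.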
AI in the Service of Content Indexing in Libraries
Slides from Christopher Kermorvant's talk at the Biennale du numérique 2023, "Intelligence artificielle : écosystèmes, enjeux, usages".
SIMARA: a database for key-value information extraction from full pages
We propose a new database for information extraction from historical
handwritten documents. The corpus includes 5,393 finding aids from six
different series, dating from the 18th to the 20th centuries. Finding aids are
handwritten documents that contain metadata describing older archives. They are
stored in the National Archives of France and are used by archivists to
identify and find archival documents. Each document is annotated at page level
and contains seven fields to retrieve. The localization of each field is not
available, so this dataset encourages research on segmentation-free systems
for information extraction. We propose a model based on the Transformer
architecture trained for end-to-end information extraction, and provide three
sets for training, validation, and testing to ensure fair comparison with
future work. The database is freely accessible at
https://zenodo.org/record/7868059.
Large-scale genealogical information extraction from handwritten Quebec parish records
This paper presents a complete workflow designed for extracting information from Quebec handwritten parish registers. The acts in these documents contain individual and family information highly valuable for genetic, demographic, and social studies of the Quebec population. From an image of parish records, our workflow is able to identify the acts and extract personal information. The workflow is divided into successive steps: page classification, text line detection, handwritten text recognition, named entity recognition, and act detection and classification. For all these steps, different machine learning models are compared. Once the information is extracted, validation rules designed by experts are applied to standardize the extracted information and ensure its consistency with the type of act (birth, marriage, or death). This validation step is able to reject records that are considered invalid or merged. The full workflow has been used to process over two million pages of Quebec parish registers from the 19th and 20th centuries. On a sample comprising 65% of the registers, 3.2 million acts were recognized. Verification of the birth and death acts from this sample shows that 74% of them are considered complete and valid. These records will be integrated into the BALSAC database and linked together to recreate family and genealogical relations at large scale.
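The successive workflow steps listed above can be sketched as a simple pipeline of stage functions (the stubs below are placeholders; the actual trained models and expert rule sets are not described as a public API):

```python
# Each stage consumes the output of the previous one. All stage bodies are
# illustrative stubs standing in for the trained models of the workflow.

def classify_page(image):          # step 1: page classification (stub)
    return {"image": image, "page_type": "act_page"}

def detect_lines(page):            # step 2: text line detection (stub)
    page["lines"] = ["line-1", "line-2"]
    return page

def recognize_text(page):          # step 3: handwritten text recognition (stub)
    page["text"] = " ".join(page["lines"])
    return page

def extract_entities(page):        # step 4: named entity recognition (stub)
    page["entities"] = {"name": "?", "date": "?"}
    return page

def validate(page):                # step 5: expert validation rules (stub)
    page["valid"] = bool(page["entities"])
    return page

def run_workflow(image):
    """Run all stages in order on one page image."""
    page = classify_page(image)
    for stage in (detect_lines, recognize_text, extract_entities, validate):
        page = stage(page)
    return page
```

Structuring the workflow this way makes each stage independently replaceable, which matches the paper's comparison of different models per step.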
Landscape Analysis for the Specimen Data Refinery
This report reviews the current state-of-the-art applied approaches to automated tools, services, and workflows for extracting information from images of natural history specimens and their labels. We consider the potential for repurposing existing tools, including workflow management systems, and areas where more development is required. This paper was written as part of the SYNTHESYS+ project for software development teams and informatics teams working on new software-based approaches to improve mass digitisation of natural history specimens.
A Comparison of Noise Reduction Techniques for Robust Speech Recognition
This report presents the integration of several noise reduction methods into the frontend for speech recognition developed at IDIAP. The chosen methods are: Spectral Subtraction, Cepstral Mean Subtraction, and Blind Equalization. These different methods are studied from a theoretical point of view. Their implementation is described and they are tested on the Numbers95 speech database. Good noise robustness is obtained by combining two of these methods, such as Spectral Subtraction with Cepstral Mean Subtraction or Spectral Subtraction with Blind Equalization. The latter combination is found to be more appropriate for real recognition systems since it is frame synchronous. A comparison with Jah-RASTA-PLP is also given. Acknowledgements: The support of the OFES under the grant for the "Speech, Hearing and Recognition" (SPHEAR) project # OFES 970299 is gratefully acknowledged. The work described in this report benefited from fruitful discussions with Chafic Mokbel. (IDIAP-RR 99-10)
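Cepstral Mean Subtraction, one of the combined methods above, amounts to removing the per-utterance mean from each cepstral coefficient (a pure-Python sketch; real frontends operate on MFCC frames computed from audio):

```python
# Cepstral Mean Subtraction: subtracting the long-term mean of each cepstral
# coefficient cancels stationary convolutional (channel) distortion, since a
# fixed channel adds a constant offset in the cepstral domain.

def cepstral_mean_subtraction(frames):
    """frames: list of cepstral vectors (lists of floats) for one utterance.

    Returns the frames with the per-coefficient mean removed.
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[i] for frame in frames) / n for i in range(dim)]
    return [[frame[i] - means[i] for i in range(dim)] for frame in frames]
```

Note that this whole-utterance form is not frame synchronous, which is why the report prefers the Blind Equalization combination for real recognition systems.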